18 research outputs found

    Scheduling Independent Tasks on Multi-cores with GPU Accelerators

    Get PDF
    Best PaperInternational audienceMore and more computers use hybrid architectures combin-ing multi-core processors and hardware accelerators like GPUs (Graphics Processing Units). We present in this paper a new method for scheduling efficiently parallel applications with mm CPUs and kk GPUs, where each task of the application can be processed either on a core (CPU) or on a GPU. The objective is to minimize the makespan. The corresponding scheduling problem is NP-hard, we propose an efficient approximation algorithm which achieves an approximation ratio of 43+13k\frac{4}{3} + \frac{1}{3k} . We first detail and analyze the method, based on a dual approximation scheme, that uses a dynamic programming scheme to balance evenly the load between the heterogeneous resources. Finally, we run some simulations based on realistic benchmarks and compare the solution obtained by a relaxed version of this method to the one provided by a classical greedy algorithm and to lower bounds on the value of the optimal makespan

    Scheduling independent tasks on multi-cores with GPU accelerators

    Get PDF
    International audienceMore and more computers use hybrid architectures combining multi-core processors and hardware accelerators like GPUs (Graphics Process-ing Units). We present in this paper a new method for scheduling efficiently parallel applications with m CPUs and k GPUs, where each task of the appli-cation can be processed either on a core (CPU) or on a GPU. The objective is to minimize the maximum completion time (makespan). The corresponding scheduling problem is NP-hard, we propose an efficient approximation algo-rithm which achieves an approximation ratio of 4 3 + 1 3k . We first detail and analyze the method, based on a dual approximation scheme, that uses dynamic programming to balance evenly the load between the heterogeneous resources. Then, we present a faster approximation algorithm for a special case of the previous problem, where all the tasks are accelerated when affected to GPU, with a performance guarantee of 3 2 for any number of GPUs. We run some simulations based on realistic benchmarks and compare the solutions obtained by a relaxed version of the generic method to the one provided by a classical scheduling algorithm (HEFT). Finally, we present an implementation of the 4/3-approximation and its relaxed version on a classical linear algebra kernel into the scheduler of the xKaapi runtime system

    A study of scheduling problems with preemptions on multi-core computers with GPU accelerators

    Get PDF
    International audienceFor many years, scheduling problems have been concerned either with parallel processor systems or with dedicated processors-job shop type systems. With a development of new computing architectures this partition is no longer so obvious. Multi-core (processor) computers equipped with GPU co-processors require new scheduling strategies. This paper is devoted to a characterization of this new type of scheduling problems. After a thorough introduction of the new model of a computing system, an extension of the classical notation of scheduling problems is proposed. A special attention is paid to preemptions, since this feature of the new architecture differs the most as compared with the classical model. In the paper, several scheduling algorithms, new ones and those refining classical approaches, are presented. Possible extensions of the model are also discussed

    Scheduling Independent Moldable Tasks on Multi-Cores with GPUs

    Get PDF
    The number of parallel systems using accelerators is growing up.The technology is now mature enough to allow sustainedpetaflop/s. However, reaching this performance scale requiresefficient scheduling algorithms to manage the heterogeneouscomputing resources.We present a new approach for scheduling independent tasks onmultiple CPUs and multiple GPUs. The tasks are assumed to beparallelizable on CPUs using the moldable model: the final numberof cores allotted to a task can be decided and set by thescheduler. More precisely, we design an algorithm aiming atminimizing the makespan---the maximum completion time of alltasks---for this scheduling problem. The proposed algorithmcombines a dual approximation scheme with a fast integer linearprogram (ILP). It determines both the partitioning of the tasks,ie whether a task should be mapped to CPUs or a GPU, and thenumber of CPUs allotted to a moldable task if mapped to the CPUs.A worst case analysis shows that the algorithm has anapproximation ratio of 32+ϵ\frac{3}{2} + \epsilon. However, sincethe complexity of the ILP-based algorithm could benon-polynomial, we also present a proved polynomial-timealgorithm with an approximation ratio of 2+ϵ2+\epsilon.We complement the theoretical analysis of our two novelalgorithms with an experimental study. In these experiments, wecompare our algorithms to a modified version of the classical\heft algorithm, adapted to handle moldable tasks. Theexperimental results show that our algorithm with the32+ϵ\frac{3}{2} + \epsilon approximation ratio producessignificantly shorter schedules than the modified \heft for mostof the instances. In addition, the experiments provide evidencethat this ILP-based algorithm is also practically able to solvelarger problem instances in a reasonable amount of time

    Ordonnancement pour les nouvelles plateformes de calcul avec GPUs

    No full text
    More and more computers use hybrid architectures combining multi-core processors (CPUs) and hardware accelerators like GPUs (Graphics Processing Units). These hybrid parallel platforms require new scheduling strategies. This work is devoted to a characterization of this new type of scheduling problems. The most studied objective in this work is the minimization of the makespan, which is a crucial problem for reaching the potential of new platforms in High Performance Computing. The core problem studied in this work is scheduling efficiently n independent sequential tasks with m CPUs and k GPUs, where each task of the application can be processed either on a CPU or on a GPU, with minimum makespan. This problem is NP-hard, therefore we propose approximation algorithms with performance ratios ranging from 2 to (2q+1)/(2q)+1/(2qk), q>0, and corresponding polynomial time complexities. The proposed solving method is the first general purpose algorithm for scheduling on hybrid machines with a theoretical performance guarantee that can be used for practical purposes. Some variants of the core problem are studied: a special case where all the tasks are accelerated when assigned to a GPU, with a 3/2-approximation algorithm, a case where preemptions are allowed on CPUs, the same problem with malleable tasks, with an algorithm with a ratio of 3/2. Finally, we studied the problem with dependent tasks, providing a 6-approximation algorithm. Experiments based on realistic benchmarks have been conducted. Some algorithms have been integrated into the scheduler of the xKaapi runtime system for linear algebra kernels, and compared to the state-of-the-art algorithm HEFT.De plus en plus d'ordinateurs utilisent des architectures hybrides combinant des processeurs multi-cœurs (CPUs) et des accélérateurs matériels comme les GPUs (Graphics Processing Units). Ces plates-formes parallèles hybrides exigent de nouvelles stratégies d'ordonnancement adaptées. Cette thèse est consacrée à une caractérisation de ce nouveau type de problèmes d'ordonnancement. L'objectif le plus étudié dans ce travail est la minimisation du makespan, qui est un problème crucial pour atteindre le potentiel des nouvelles plates-formes en Calcul Haute Performance.Le problème central étudié dans ce travail est le problème d'ordonnancement efficace de n tâches séquentielles indépendantes sur une plateforme de m CPUs et k GPUs, où chaque tâche peut être exécutée soit sur un CPU ou sur un GPU, avec un makespan minimal. Ce problème est NP-difficiles, nous proposons donc des algorithmes d'approximation avec des garanties de performance allant de 2 à (2q + 1)/(2q) +1/(2qk), q> 0, et des complexités polynomiales. Il s'agit des premiers algorithmes génériques pour la planification sur des machines hybrides avec une garantie de performance et une fin pratique. Des variantes du problème central ont été étudiées : un cas particulier où toutes les tâches sont accélérées quand elles sont affectées à un GPU, avec un algorithme avec un ratio de 3/2, un cas où les préemptions sont autorisées sur CPU, mais pas sur GPU, le modèle des tâches malléables, avec un algorithme avec un ratio de 3/2. Enfin, le problème avec des tâches dépendantes a été étudié, avec un algorithme avec un ratio de 6. Certains des algorithmes ont été intégré dans l'ordonnanceur du système xKaapi

    Scheduling for new computing platforms with GPUs

    No full text
    De plus en plus d'ordinateurs utilisent des architectures hybrides combinant des processeurs multi-cœurs (CPUs) et des accélérateurs matériels comme les GPUs (Graphics Processing Units). Ces plates-formes parallèles hybrides exigent de nouvelles stratégies d'ordonnancement adaptées. Cette thèse est consacrée à une caractérisation de ce nouveau type de problèmes d'ordonnancement. L'objectif le plus étudié dans ce travail est la minimisation du makespan, qui est un problème crucial pour atteindre le potentiel des nouvelles plates-formes en Calcul Haute Performance.Le problème central étudié dans ce travail est le problème d'ordonnancement efficace de n tâches séquentielles indépendantes sur une plateforme de m CPUs et k GPUs, où chaque tâche peut être exécutée soit sur un CPU ou sur un GPU, avec un makespan minimal. Ce problème est NP-difficiles, nous proposons donc des algorithmes d'approximation avec des garanties de performance allant de 2 à (2q + 1)/(2q) +1/(2qk), q> 0, et des complexités polynomiales. Il s'agit des premiers algorithmes génériques pour la planification sur des machines hybrides avec une garantie de performance et une fin pratique. Des variantes du problème central ont été étudiées : un cas particulier où toutes les tâches sont accélérées quand elles sont affectées à un GPU, avec un algorithme avec un ratio de 3/2, un cas où les préemptions sont autorisées sur CPU, mais pas sur GPU, le modèle des tâches malléables, avec un algorithme avec un ratio de 3/2. Enfin, le problème avec des tâches dépendantes a été étudié, avec un algorithme avec un ratio de 6. Certains des algorithmes ont été intégré dans l'ordonnanceur du système xKaapi.More and more computers use hybrid architectures combining multi-core processors (CPUs) and hardware accelerators like GPUs (Graphics Processing Units). These hybrid parallel platforms require new scheduling strategies. This work is devoted to a characterization of this new type of scheduling problems. The most studied objective in this work is the minimization of the makespan, which is a crucial problem for reaching the potential of new platforms in High Performance Computing. The core problem studied in this work is scheduling efficiently n independent sequential tasks with m CPUs and k GPUs, where each task of the application can be processed either on a CPU or on a GPU, with minimum makespan. This problem is NP-hard, therefore we propose approximation algorithms with performance ratios ranging from 2 to (2q+1)/(2q)+1/(2qk), q>0, and corresponding polynomial time complexities. The proposed solving method is the first general purpose algorithm for scheduling on hybrid machines with a theoretical performance guarantee that can be used for practical purposes. Some variants of the core problem are studied: a special case where all the tasks are accelerated when assigned to a GPU, with a 3/2-approximation algorithm, a case where preemptions are allowed on CPUs, the same problem with malleable tasks, with an algorithm with a ratio of 3/2. Finally, we studied the problem with dependent tasks, providing a 6-approximation algorithm. Experiments based on realistic benchmarks have been conducted. Some algorithms have been integrated into the scheduler of the xKaapi runtime system for linear algebra kernels, and compared to the state-of-the-art algorithm HEFT

    Scheduling Tasks with Precedence Constraints on Hybrid Multi-core Machines

    No full text
    International audienceIn this work, we are interested in scheduling dependent tasks for hybrid parallel multi-core machines, composed of CPUs with additional accelerators (GPUs). The objective is to minimize the make span, which is a crucial problem for reaching the potential of new platforms in High Performance Computing. We provide an approximation algorithm with a performance guarantee of 6 to solve this problem. The algorithm is a two-phase solving method: a first phase based on rounding the solution provided by solving a linear programming formulation for the assignment of the tasks to the resources. A second phase uses a classical list algorithm to schedule the tasks according to the assignment phase. The proposed approach is the first generic algorithm with a performance guarantee for scheduling tasks with precedence constraints on hybrid platforms with CPUs and GPUs resources
    corecore